Overview

Dataset statistics

Number of variables: 23
Number of observations: 8123
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.4 MiB
Average record size in memory: 184.0 B
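
The two memory figures are consistent with each other if every cell occupies 8 bytes. That cell size is an assumption about the underlying DataFrame, not something the report states. A minimal sketch of the arithmetic:

```python
# Assumption: each of the 23 columns stores one 8-byte value per row
# (for example int64, float64, or an object pointer).
N_VARIABLES = 23
N_OBSERVATIONS = 8123
BYTES_PER_CELL = 8  # assumed, not stated in the report

record_bytes = N_VARIABLES * BYTES_PER_CELL
total_mib = record_bytes * N_OBSERVATIONS / 2**20

print(f"Average record size: {record_bytes:.1f} B")
print(f"Total size in memory: {total_mib:.1f} MiB")
```

Under that assumption the per-record size comes out at 184 B and the total rounds to 1.4 MiB, matching the figures reported here.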

Variable types

Categorical: 14
Numeric: 9

Warnings

85 has constant value "85" (Constant)
1 is highly correlated with 23 and 6 other fields (High correlation)
23 is highly correlated with 1 and 5 other fields (High correlation)
25 is highly correlated with 1 and 4 other fields (High correlation)
34 is highly correlated with 67 and 2 other fields (High correlation)
38 is highly correlated with 1 and 1 other field (High correlation)
40 is highly correlated with 25 and 4 other fields (High correlation)
54 is highly correlated with 25 and 3 other fields (High correlation)
59 is highly correlated with 1 and 3 other fields (High correlation)
63 is highly correlated with 1 and 3 other fields (High correlation)
67 is highly correlated with 34 and 1 other field (High correlation)
76 is highly correlated with 34 and 2 other fields (High correlation)
86 is highly correlated with 34 and 1 other field (High correlation)
93 is highly correlated with 1 and 3 other fields (High correlation)
98 is highly correlated with 1 and 4 other fields (High correlation)
107 is highly correlated with 113 (High correlation)
113 is highly correlated with 40 and 2 other fields (High correlation)
1 is highly correlated with 23 and 6 other fields (High correlation)
23 is highly correlated with 1 and 5 other fields (High correlation)
25 is highly correlated with 1 and 6 other fields (High correlation)
34 is highly correlated with 86 (High correlation)
38 is highly correlated with 1 (High correlation)
40 is highly correlated with 25 and 3 other fields (High correlation)
54 is highly correlated with 25 and 3 other fields (High correlation)
59 is highly correlated with 1 and 4 other fields (High correlation)
63 is highly correlated with 1 and 3 other fields (High correlation)
67 is highly correlated with 76 and 1 other field (High correlation)
76 is highly correlated with 67 and 1 other field (High correlation)
86 is highly correlated with 34 (High correlation)
93 is highly correlated with 1 and 4 other fields (High correlation)
98 is highly correlated with 1 and 4 other fields (High correlation)
107 is highly correlated with 67 and 1 other field (High correlation)
113 is highly correlated with 40 and 1 other field (High correlation)
1 is highly correlated with 23 and 5 other fields (High correlation)
23 is highly correlated with 1 and 4 other fields (High correlation)
25 is highly correlated with 1 and 3 other fields (High correlation)
34 is highly correlated with 86 (High correlation)
38 is highly correlated with 1 (High correlation)
40 is highly correlated with 54 (High correlation)
54 is highly correlated with 40 and 2 other fields (High correlation)
59 is highly correlated with 1 and 3 other fields (High correlation)
63 is highly correlated with 23 and 2 other fields (High correlation)
67 is highly correlated with 76 (High correlation)
76 is highly correlated with 67 (High correlation)
86 is highly correlated with 34 (High correlation)
93 is highly correlated with 1 and 4 other fields (High correlation)
98 is highly correlated with 1 and 2 other fields (High correlation)
113 is highly correlated with 54 (High correlation)
13 is highly correlated with 36 and 11 other fields (High correlation)
36 is highly correlated with 13 and 5 other fields (High correlation)
59 is highly correlated with 36 and 10 other fields (High correlation)
38 is highly correlated with 13 and 7 other fields (High correlation)
86 is highly correlated with 59 and 5 other fields (High correlation)
54 is highly correlated with 13 and 10 other fields (High correlation)
40 is highly correlated with 13 and 17 other fields (High correlation)
107 is highly correlated with 13 and 14 other fields (High correlation)
76 is highly correlated with 13 and 15 other fields (High correlation)
93 is highly correlated with 13 and 11 other fields (High correlation)
113 is highly correlated with 13 and 9 other fields (High correlation)
52 is highly correlated with 13 and 7 other fields (High correlation)
67 is highly correlated with 13 and 15 other fields (High correlation)
25 is highly correlated with 13 and 15 other fields (High correlation)
3 is highly correlated with 107 (High correlation)
63 is highly correlated with 36 and 11 other fields (High correlation)
23 is highly correlated with 59 and 7 other fields (High correlation)
90 is highly correlated with 40 and 7 other fields (High correlation)
1 is highly correlated with 36 and 10 other fields (High correlation)
34 is highly correlated with 86 and 5 other fields (High correlation)
98 is highly correlated with 13 and 16 other fields (High correlation)
36 is highly correlated with 54 and 1 other field (High correlation)
59 is highly correlated with 23 and 2 other fields (High correlation)
52 is highly correlated with 93 and 1 other field (High correlation)
38 is highly correlated with 54 and 3 other fields (High correlation)
86 is highly correlated with 34 and 1 other field (High correlation)
54 is highly correlated with 36 and 4 other fields (High correlation)
9 is highly correlated with 85 (High correlation)
63 is highly correlated with 54 and 4 other fields (High correlation)
23 is highly correlated with 59 and 5 other fields (High correlation)
1 is highly correlated with 59 and 5 other fields (High correlation)
90 is highly correlated with 93 and 1 other field (High correlation)
34 is highly correlated with 86 and 1 other field (High correlation)
93 is highly correlated with 52 and 6 other fields (High correlation)
85 is highly correlated with 36 and 12 other fields (High correlation)
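
As a rough illustration of how "Constant" and "High correlation" warnings of this kind could be derived, the sketch below flags columns with a single distinct value and column pairs whose absolute Pearson correlation reaches a threshold. The 0.9 threshold, the function names, and the toy data are all assumptions made for illustration; the report may apply other or additional correlation measures.

```python
from math import sqrt

def pearson(xs, ys):
    """Plain Pearson correlation; returns 0.0 if either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)

def find_warnings(columns, threshold=0.9):
    """Return (constant_columns, highly_correlated_pairs) for a dict of numeric columns."""
    constant = [name for name, values in columns.items() if len(set(values)) == 1]
    names = [name for name in columns if name not in constant]
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if abs(pearson(columns[a], columns[b])) >= threshold:
                pairs.append((a, b))
    return constant, pairs

# Hypothetical toy data, not taken from the profiled dataset.
toy = {
    "a": [1, 2, 3, 4, 5],
    "b": [2, 4, 6, 8, 11],  # nearly proportional to "a"
    "c": [5, 1, 4, 2, 3],   # only weakly related to "a"
    "d": [7, 7, 7, 7, 7],   # constant
}
constant, pairs = find_warnings(toy)
print(constant, pairs)
```

Constant columns are excluded from the pairwise check in this sketch because their correlation with anything is undefined.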

Reproduction

Analysis started: 2021-09-16 09:57:18.188879
Analysis finished: 2021-09-16 09:57:42.459309
Duration: 24.27 seconds
Software version: pandas-profiling v3.0.0
Download configuration: config.json
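
The reported duration can be checked against the two timestamps with the standard library. This is only a consistency check on the figures in this section, not a description of how the tool measures time.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2021-09-16 09:57:18.188879", FMT)
finished = datetime.strptime("2021-09-16 09:57:42.459309", FMT)

# Elapsed wall-clock time between the two timestamps, in seconds.
duration = (finished - started).total_seconds()
print(f"Duration: {duration:.2f} seconds")
```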

Variables

1
Categorical

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 63.6 KiB

Value  Count
2      4208
1      3915

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 8123
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
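
A minimal sketch of looking up one such character property, the general category, with Python's standard `unicodedata` module. The input list is the five sample rows shown for this variable. Script and block lookups are not available in the standard library and are left out here.

```python
import unicodedata
from collections import Counter

values = ["2", "2", "1", "2", "2"]  # the five sample rows of this variable

# Count the Unicode general category of every character across all values.
# "Nd" is the abbreviation for the "Decimal Number" category.
categories = Counter(unicodedata.category(ch) for v in values for ch in v)
print(categories)
```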

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2
2nd row: 2
3rd row: 1
4th row: 2
5th row: 2

Common Values

Value  Count  Frequency (%)
2      4208   51.8%
1      3915   48.2%
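
The frequency column follows directly from the counts. A short sketch, assuming percentages are rounded to one decimal place:

```python
counts = {"2": 4208, "1": 3915}  # value -> count, from the Common Values table

total = sum(counts.values())
# Share of each value as a percentage, rounded to one decimal place.
freq = {value: round(100 * n / total, 1) for value, n in counts.items()}
print(total, freq)
```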

Length

Histogram of lengths of the category

Pie chart

Value  Count  Frequency (%)
2      4208   51.8%
1      3915   48.2%

Most occurring characters

Value  Count  Frequency (%)
2      4208   51.8%
1      3915   48.2%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  8123   100.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
2      4208   51.8%
1      3915   48.2%

Most occurring scripts

Value   Count  Frequency (%)
Common  8123   100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
2      4208   51.8%
1      3915   48.2%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  8123   100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
2      4208   51.8%
1      3915   48.2%

3
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.63781854
Minimum: 3
Maximum: 8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 63.6 KiB

Quantile statistics

Minimum: 3
5-th percentile: 3
Q1: 3
Median: 4
Q3: 6
95-th percentile: 7
Maximum: 8
Range: 5
Interquartile range (IQR): 3
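
These quantiles can be reproduced from the value counts listed later in this section. The sketch assumes linear interpolation between order statistics (`method="inclusive"` in `statistics.quantiles`); that choice is an assumption about how the report computes them, although it does reproduce the figures here.

```python
from statistics import quantiles

# value -> count for variable 3, from the frequency table in this section
counts = {3: 3655, 4: 452, 5: 32, 6: 3152, 7: 828, 8: 4}
data = sorted(v for v, n in counts.items() for _ in range(n))

# Quartiles, then the first and last of the 19 cut points that split the data into twentieths.
q1, median, q3 = quantiles(data, n=4, method="inclusive")
twentieths = quantiles(data, n=20, method="inclusive")
p5, p95 = twentieths[0], twentieths[-1]

value_range = max(data) - min(data)
iqr = q3 - q1
print(p5, q1, median, q3, p95, value_range, iqr)
```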

Descriptive statistics

Standard deviation: 1.588963341
Coefficient of variation (CV): 0.3426100714
Kurtosis: -1.773377011
Mean: 4.63781854
Median Absolute Deviation (MAD): 1
Skewness: 0.08776706125
Sum: 37673
Variance: 2.524804499
Monotonicity: Not monotonic
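
Several of these figures are tied together by simple identities: the mean is the sum divided by the number of observations, the standard deviation is the square root of the variance, and the coefficient of variation is the standard deviation divided by the mean. The sketch below recomputes them from the value counts of this variable, assuming the sample variance (division by n - 1), which is the convention that matches the reported value.

```python
from math import sqrt

# value -> count for variable 3, from the frequency table in this section
counts = {3: 3655, 4: 452, 5: 32, 6: 3152, 7: 828, 8: 4}

n = sum(counts.values())
total = sum(v * c for v, c in counts.items())
mean = total / n
# Sample variance: squared deviations weighted by count, divided by n - 1.
variance = sum(c * (v - mean) ** 2 for v, c in counts.items()) / (n - 1)
std = sqrt(variance)
cv = std / mean
print(n, total, mean, variance, std, cv)
```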
Histogram with fixed size bins (bins=6)

Value  Count  Frequency (%)
3      3655   45.0%
6      3152   38.8%
7      828    10.2%
4      452    5.6%
5      32     0.4%
8      4      < 0.1%

Value  Count  Frequency (%)
3      3655   45.0%
4      452    5.6%
5      32     0.4%
6      3152   38.8%
7      828    10.2%
8      4      < 0.1%

Value  Count  Frequency (%)
8      4      < 0.1%
7      828    10.2%
6      3152   38.8%
5      32     0.4%
4      452    5.6%
3      3655   45.0%

9
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 63.6 KiB

Value  Count
10     3244
9      2555
11     2320
12     4

Length

Max length: 2
Median length: 2
Mean length: 1.685461037
Min length: 1

Characters and Unicode

Total characters: 13691
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 9
2nd row: 9
3rd row: 10
4th row: 9
5th row: 10

Common Values

Value  Count  Frequency (%)
10     3244   39.9%
9      2555   31.5%
11     2320   28.6%
12     4      < 0.1%

Length

Histogram of lengths of the category

Pie chart

Value  Count  Frequency (%)
10     3244   39.9%
9      2555   31.5%
11     2320   28.6%
12     4      < 0.1%

Most occurring characters

Value  Count  Frequency (%)
1      7888   57.6%
0      3244   23.7%
9      2555   18.7%
2      4      < 0.1%
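
The character counts for this variable follow from its value counts: every occurrence of a value contributes each of its characters once. A short sketch of that tally, which also recovers the total character count and the mean length reported for this variable:

```python
from collections import Counter

# value -> count for variable 9, from the Common Values table
value_counts = {"10": 3244, "9": 2555, "11": 2320, "12": 4}

char_counts = Counter()
for value, n in value_counts.items():
    for ch in value:
        char_counts[ch] += n  # "11" contributes the character "1" twice per row

total_chars = sum(char_counts.values())
mean_length = total_chars / sum(value_counts.values())
print(char_counts, total_chars, mean_length)
```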

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  13691  100.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
1      7888   57.6%
0      3244   23.7%
9      2555   18.7%
2      4      < 0.1%

Most occurring scripts

Value   Count  Frequency (%)
Common  13691  100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
1      7888   57.6%
0      3244   23.7%
9      2555   18.7%
2      4      < 0.1%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  13691  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
1      7888   57.6%
0      3244   23.7%
9      2555   18.7%
2      4      < 0.1%

13
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 10
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15.09380771
Minimum: 13
Maximum: 22
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 63.6 KiB

Quantile statistics

Minimum: 13
5-th percentile: 13
Q1: 13
Median: 15
Q3: 16
95-th percentile: 17
Maximum: 22
Range: 9
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 1.744746874
Coefficient of variation (CV): 0.1155935539
Kurtosis: -0.1417262653
Mean: 15.09380771
Median Absolute Deviation (MAD): 2
Skewness: 0.4445257779
Sum: 122607
Variance: 3.044141656
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=10)

Value  Count  Frequency (%)
13     2283   28.1%
16     1840   22.7%
17     1500   18.5%
14     1072   13.2%
15     1040   12.8%
19     168    2.1%
18     144    1.8%
21     44     0.5%
20     16     0.2%
22     16     0.2%

Value  Count  Frequency (%)
13     2283   28.1%
14     1072   13.2%
15     1040   12.8%
16     1840   22.7%
17     1500   18.5%
18     144    1.8%
19     168    2.1%
20     16     0.2%
21     44     0.5%
22     16     0.2%

Value  Count  Frequency (%)
22     16     0.2%
21     44     0.5%
20     16     0.2%
19     168    2.1%
18     144    1.8%
17     1500   18.5%
16     1840   22.7%
15     1040   12.8%
14     1072   13.2%
13     2283   28.1%

23
Categorical

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 63.6 KiB

Value  Count
24     4748
23     3375

Length

Max length2
Median length2
Mean length2
Min length2

Characters and Unicode

Total characters16246
Distinct characters3
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row23
2nd row23
3rd row23
4th row24
5th row23

Common Values

Value  Count  Frequency (%)
24     4748   58.5%
23     3375   41.5%

Length

Histogram of lengths of the category

Pie chart

Value  Count  Frequency (%)
24     4748   58.5%
23     3375   41.5%

Most occurring characters

ValueCountFrequency (%)
28123
50.0%
44748
29.2%
33375
20.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number16246
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
28123
50.0%
44748
29.2%
33375
20.8%

Most occurring scripts

ValueCountFrequency (%)
Common16246
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
28123
50.0%
44748
29.2%
33375
20.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII16246
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
28123
50.0%
44748
29.2%
33375
20.8%

25
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 9
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 28.58980672
Minimum: 25
Maximum: 33
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 63.6 KiB

Quantile statistics

Minimum: 25
5-th percentile: 26
Q1: 28
Median: 28
Q3: 29
95-th percentile: 32
Maximum: 33
Range: 8
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.557295941
Coefficient of variation (CV): 0.0544703207
Kurtosis: 0.6671400683
Mean: 28.58980672
Median Absolute Deviation (MAD): 1
Skewness: 0.4437250173
Sum: 232235
Variance: 2.425170647
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=9)

Value  Count  Frequency (%)
28     3528   43.4%
29     2160   26.6%
32     576    7.1%
31     576    7.1%
26     400    4.9%
27     400    4.9%
25     255    3.1%
30     192    2.4%
33     36     0.4%

Value  Count  Frequency (%)
25     255    3.1%
26     400    4.9%
27     400    4.9%
28     3528   43.4%
29     2160   26.6%
30     192    2.4%
31     576    7.1%
32     576    7.1%
33     36     0.4%

Value  Count  Frequency (%)
33     36     0.4%
32     576    7.1%
31     576    7.1%
30     192    2.4%
29     2160   26.6%
28     3528   43.4%
27     400    4.9%
26     400    4.9%
25     255    3.1%

34
Categorical

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 63.6 KiB

Value  Count
34     7913
35     210

Length

Max length2
Median length2
Mean length2
Min length2

Characters and Unicode

Total characters16246
Distinct characters3
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row34
2nd row34
3rd row34
4th row34
5th row34

Common Values

Value  Count  Frequency (%)
34     7913   97.4%
35     210    2.6%

Length

Histogram of lengths of the category

Pie chart

Value  Count  Frequency (%)
34     7913   97.4%
35     210    2.6%

Most occurring characters

ValueCountFrequency (%)
38123
50.0%
47913
48.7%
5210
 
1.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number16246
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
38123
50.0%
47913
48.7%
5210
 
1.3%

Most occurring scripts

ValueCountFrequency (%)
Common16246
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
38123
50.0%
47913
48.7%
5210
 
1.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII16246
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
38123
50.0%
47913
48.7%
5210
 
1.3%

36
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 63.6 KiB

Value  Count
36     6811
37     1312

Length

Max length2
Median length2
Mean length2
Min length2

Characters and Unicode

Total characters16246
Distinct characters3
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row36
2nd row36
3rd row36
4th row37
5th row36

Common Values

Value  Count  Frequency (%)
36     6811   83.8%
37     1312   16.2%

Length

Histogram of lengths of the category

Pie chart

Value  Count  Frequency (%)
36     6811   83.8%
37     1312   16.2%

Most occurring characters

ValueCountFrequency (%)
38123
50.0%
66811
41.9%
71312
 
8.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number16246
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
38123
50.0%
66811
41.9%
71312
 
8.1%

Most occurring scripts

ValueCountFrequency (%)
Common16246
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
38123
50.0%
66811
41.9%
71312
 
8.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII16246
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
38123
50.0%
66811
41.9%
71312
 
8.1%

38
Categorical

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 63.6 KiB

Value  Count
39     5612
38     2511

Length

Max length2
Median length2
Mean length2
Min length2

Characters and Unicode

Total characters16246
Distinct characters3
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row39
2nd row39
3rd row38
4th row39
5th row39

Common Values

Value  Count  Frequency (%)
39     5612   69.1%
38     2511   30.9%

Length

Histogram of lengths of the category

Pie chart

Value  Count  Frequency (%)
39     5612   69.1%
38     2511   30.9%

Most occurring characters

ValueCountFrequency (%)
38123
50.0%
95612
34.5%
82511
 
15.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number16246
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
38123
50.0%
95612
34.5%
82511
 
15.5%

Most occurring scripts

ValueCountFrequency (%)
Common16246
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
38123
50.0%
95612
34.5%
82511
 
15.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII16246
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
38123
50.0%
95612
34.5%
82511
 
15.5%

40
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 12
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 44.27477533
Minimum: 40
Maximum: 51
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 63.6 KiB

Quantile statistics

Minimum: 40
5-th percentile: 40.1
Q1: 42
Median: 44
Q3: 46
95-th percentile: 48
Maximum: 51
Range: 11
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 2.669395287
Coefficient of variation (CV): 0.06029156031
Kurtosis: -0.9179856198
Mean: 44.27477533
Median Absolute Deviation (MAD): 2
Skewness: 0.3344192409
Sum: 359644
Variance: 7.125671197
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=12)

Value             Count  Frequency (%)
48                1728   21.3%
43                1492   18.4%
44                1202   14.8%
41                1048   12.9%
42                752    9.3%
45                732    9.0%
46                492    6.1%
40                407    5.0%
47                96     1.2%
50                86     1.1%
Other values (2)  88     1.1%

Value  Count  Frequency (%)
40     407    5.0%
41     1048   12.9%
42     752    9.3%
43     1492   18.4%
44     1202   14.8%
45     732    9.0%
46     492    6.1%
47     96     1.2%
48     1728   21.3%
49     24     0.3%

Value  Count  Frequency (%)
51     64     0.8%
50     86     1.1%
49     24     0.3%
48     1728   21.3%
47     96     1.2%
46     492    6.1%
45     732    9.0%
44     1202   14.8%
43     1492   18.4%
42     752    9.3%

52
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 63.6 KiB

Value  Count
53     4608
52     3515

Length

Max length2
Median length2
Mean length2
Min length2

Characters and Unicode

Total characters16246
Distinct characters3
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row52
2nd row52
3rd row52
4th row53
5th row52

Common Values

Value  Count  Frequency (%)
53     4608   56.7%
52     3515   43.3%

Length

Histogram of lengths of the category

Pie chart

Value  Count  Frequency (%)
53     4608   56.7%
52     3515   43.3%

Most occurring characters

ValueCountFrequency (%)
58123
50.0%
34608
28.4%
23515
21.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number16246
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
58123
50.0%
34608
28.4%
23515
21.6%

Most occurring scripts

ValueCountFrequency (%)
Common16246
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
58123
50.0%
34608
28.4%
23515
21.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII16246
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
58123
50.0%
34608
28.4%
23515
21.6%

54
Categorical

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 5
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 63.6 KiB

Value  Count
56     3776
58     2480
54     1119
55     556
57     192

Length

Max length2
Median length2
Mean length2
Min length2

Characters and Unicode

Total characters16246
Distinct characters5
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row55
2nd row55
3rd row54
4th row54
5th row55

Common Values

Value  Count  Frequency (%)
56     3776   46.5%
58     2480   30.5%
54     1119   13.8%
55     556    6.8%
57     192    2.4%

Length

Histogram of lengths of the category

Pie chart

Value  Count  Frequency (%)
56     3776   46.5%
58     2480   30.5%
54     1119   13.8%
55     556    6.8%
57     192    2.4%

Most occurring characters

ValueCountFrequency (%)
58679
53.4%
63776
23.2%
82480
 
15.3%
41119
 
6.9%
7192
 
1.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number16246
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
58679
53.4%
63776
23.2%
82480
 
15.3%
41119
 
6.9%
7192
 
1.2%

Most occurring scripts

ValueCountFrequency (%)
Common16246
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
58679
53.4%
63776
23.2%
82480
 
15.3%
41119
 
6.9%
7192
 
1.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII16246
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
58679
53.4%
63776
23.2%
82480
 
15.3%
41119
 
6.9%
7192
 
1.2%

59
Categorical

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 63.6 KiB

Value  Count
59     5175
61     2372
60     552
62     24

Length

Max length2
Median length2
Mean length2
Min length2

Characters and Unicode

Total characters16246
Distinct characters6
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row59
2nd row59
3rd row59
4th row59
5th row59

Common Values

Value  Count  Frequency (%)
59     5175   63.7%
61     2372   29.2%
60     552    6.8%
62     24     0.3%

Length

Histogram of lengths of the category

Pie chart

Value  Count  Frequency (%)
59     5175   63.7%
61     2372   29.2%
60     552    6.8%
62     24     0.3%

Most occurring characters

ValueCountFrequency (%)
55175
31.9%
95175
31.9%
62948
18.1%
12372
14.6%
0552
 
3.4%
224
 
0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number16246
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
55175
31.9%
95175
31.9%
62948
18.1%
12372
14.6%
0552
 
3.4%
224
 
0.1%

Most occurring scripts

ValueCountFrequency (%)
Common16246
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
55175
31.9%
95175
31.9%
62948
18.1%
12372
14.6%
0552
 
3.4%
224
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII16246
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
55175
31.9%
95175
31.9%
62948
18.1%
12372
14.6%
0552
 
3.4%
224
 
0.1%

63
Categorical

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 63.6 KiB

Value  Count
63     4935
66     2304
64     600
65     284

Length

Max length2
Median length2
Mean length2
Min length2

Characters and Unicode

Total characters16246
Distinct characters4
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row63
2nd row63
3rd row63
4th row63
5th row63

Common Values

Value  Count  Frequency (%)
63     4935   60.8%
66     2304   28.4%
64     600    7.4%
65     284    3.5%

Length

Histogram of lengths of the category

Pie chart

Value  Count  Frequency (%)
63     4935   60.8%
66     2304   28.4%
64     600    7.4%
65     284    3.5%

Most occurring characters

ValueCountFrequency (%)
610427
64.2%
34935
30.4%
4600
 
3.7%
5284
 
1.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number16246
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
610427
64.2%
34935
30.4%
4600
 
3.7%
5284
 
1.7%

Most occurring scripts

ValueCountFrequency (%)
Common16246
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
610427
64.2%
34935
30.4%
4600
 
3.7%
5284
 
1.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII16246
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
610427
64.2%
34935
30.4%
4600
 
3.7%
5284
 
1.7%

67
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 9
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 68.14982149
Minimum: 67
Maximum: 75
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 63.6 KiB

Quantile statistics

Minimum: 67
5-th percentile: 67
Q1: 67
Median: 67
Q3: 69
95-th percentile: 71
Maximum: 75
Range: 8
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.563585687
Coefficient of variation (CV): 0.02294335705
Kurtosis: 1.733343832
Mean: 68.14982149
Median Absolute Deviation (MAD): 0
Skewness: 1.429409741
Sum: 553581
Variance: 2.444800202
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=9)

Value  Count  Frequency (%)
67     4463   54.9%
69     1872   23.0%
68     576    7.1%
70     448    5.5%
71     432    5.3%
73     192    2.4%
72     96     1.2%
74     36     0.4%
75     8      0.1%

Value  Count  Frequency (%)
67     4463   54.9%
68     576    7.1%
69     1872   23.0%
70     448    5.5%
71     432    5.3%
72     96     1.2%
73     192    2.4%
74     36     0.4%
75     8      0.1%

Value  Count  Frequency (%)
75     8      0.1%
74     36     0.4%
73     192    2.4%
72     96     1.2%
71     432    5.3%
70     448    5.5%
69     1872   23.0%
68     576    7.1%
67     4463   54.9%

76
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 9
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 77.06167672
Minimum: 76
Maximum: 84
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 63.6 KiB

Quantile statistics

Minimum: 76
5-th percentile: 76
Q1: 76
Median: 76
Q3: 77
95-th percentile: 80
Maximum: 84
Range: 8
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.648654618
Coefficient of variation (CV): 0.02139396245
Kurtosis: 3.834021306
Mean: 77.06167672
Median Absolute Deviation (MAD): 0
Skewness: 1.988362404
Sum: 625972
Variance: 2.718062049
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=9)

Value  Count  Frequency (%)
76     4383   54.0%
77     1872   23.0%
78     576    7.1%
80     512    6.3%
79     432    5.3%
83     192    2.4%
81     96     1.2%
84     36     0.4%
82     24     0.3%

Value  Count  Frequency (%)
76     4383   54.0%
77     1872   23.0%
78     576    7.1%
79     432    5.3%
80     512    6.3%
81     96     1.2%
82     24     0.3%
83     192    2.4%
84     36     0.4%

Value  Count  Frequency (%)
84     36     0.4%
83     192    2.4%
82     24     0.3%
81     96     1.2%
80     512    6.3%
79     432    5.3%
78     576    7.1%
77     1872   23.0%
76     4383   54.0%

85
Categorical

CONSTANT
HIGH CORRELATION
REJECTED

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 63.6 KiB

Value  Count
85     8123

Length

Max length2
Median length2
Mean length2
Min length2

Characters and Unicode

Total characters16246
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row85
2nd row85
3rd row85
4th row85
5th row85

Common Values

Value  Count  Frequency (%)
85     8123   100.0%

Length

Histogram of lengths of the category

Pie chart

Value  Count  Frequency (%)
85     8123   100.0%

Most occurring characters

ValueCountFrequency (%)
88123
50.0%
58123
50.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number16246
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
88123
50.0%
58123
50.0%

Most occurring scripts

ValueCountFrequency (%)
Common16246
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
88123
50.0%
58123
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII16246
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
88123
50.0%
58123
50.0%

86
Categorical

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 63.6 KiB

Value  Count
86     7923
87     96
88     96
89     8

Length

Max length2
Median length2
Mean length2
Min length2

Characters and Unicode

Total characters16246
Distinct characters4
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row86
2nd row86
3rd row86
4th row86
5th row86

Common Values

Value  Count  Frequency (%)
86     7923   97.5%
87     96     1.2%
88     96     1.2%
89     8      0.1%

Length

Histogram of lengths of the category

Pie chart

Value  Count  Frequency (%)
86     7923   97.5%
87     96     1.2%
88     96     1.2%
89     8      0.1%

Most occurring characters

Value  Count  Frequency (%)
8      8219   50.6%
6      7923   48.8%
7      96     0.6%
9      8      < 0.1%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  16246  100.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
8      8219   50.6%
6      7923   48.8%
7      96     0.6%
9      8      < 0.1%

Most occurring scripts

Value   Count  Frequency (%)
Common  16246  100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
8      8219   50.6%
6      7923   48.8%
7      96     0.6%
9      8      < 0.1%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  16246  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
8      8219   50.6%
6      7923   48.8%
7      96     0.6%
9      8      < 0.1%

90
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct      3
Distinct (%)  < 0.1%
Missing       0
Missing (%)   0.0%
Memory size   63.6 KiB

Value  Count
90     7487
91     600
92     36

Length

Max length     2
Median length  2
Mean length    2
Min length     2

Characters and Unicode

Total characters     16246
Distinct characters  4
Distinct categories  1
Distinct scripts     1
Distinct blocks      1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique      0
Unique (%)  0.0%

Sample

1st row  90
2nd row  90
3rd row  90
4th row  90
5th row  90

Common Values

Value  Count  Frequency (%)
90     7487   92.2%
91     600    7.4%
92     36     0.4%

Length

Histogram of lengths of the category

Pie chart


Most occurring characters

Value  Count  Frequency (%)
9      8123   50.0%
0      7487   46.1%
1      600    3.7%
2      36     0.2%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  16246  100.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
9      8123   50.0%
0      7487   46.1%
1      600    3.7%
2      36     0.2%

Most occurring scripts

Value   Count  Frequency (%)
Common  16246  100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
9      8123   50.0%
0      7487   46.1%
1      600    3.7%
2      36     0.2%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  16246  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
9      8123   50.0%
0      7487   46.1%
1      600    3.7%
2      36     0.2%

93
Categorical

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct      5
Distinct (%)  0.1%
Missing       0
Missing (%)   0.0%
Memory size   63.6 KiB

Value  Count
93     3967
94     2776
95     1296
96     48
97     36

Length

Max length     2
Median length  2
Mean length    2
Min length     2

Characters and Unicode

Total characters     16246
Distinct characters  6
Distinct categories  1
Distinct scripts     1
Distinct blocks      1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique      0
Unique (%)  0.0%

Sample

1st row  93
2nd row  93
3rd row  93
4th row  94
5th row  93

Common Values

Value  Count  Frequency (%)
93     3967   48.8%
94     2776   34.2%
95     1296   16.0%
96     48     0.6%
97     36     0.4%

Length

Histogram of lengths of the category

Pie chart


Most occurring characters

Value  Count  Frequency (%)
9      8123   50.0%
3      3967   24.4%
4      2776   17.1%
5      1296   8.0%
6      48     0.3%
7      36     0.2%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  16246  100.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
9      8123   50.0%
3      3967   24.4%
4      2776   17.1%
5      1296   8.0%
6      48     0.3%
7      36     0.2%

Most occurring scripts

Value   Count  Frequency (%)
Common  16246  100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
9      8123   50.0%
3      3967   24.4%
4      2776   17.1%
5      1296   8.0%
6      48     0.3%
7      36     0.2%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  16246  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
9      8123   50.0%
3      3967   24.4%
4      2776   17.1%
5      1296   8.0%
6      48     0.3%
7      36     0.2%

98
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct      9
Distinct (%)  0.1%
Missing       0
Missing (%)   0.0%
Infinite      0
Infinite (%)  0.0%
Mean          100.2011572
Minimum       98
Maximum       106
Zeros         0
Zeros (%)     0.0%
Negative      0
Negative (%)  0.0%
Memory size   63.6 KiB

Quantile statistics

Minimum                    98
5-th percentile            98
Q1                         99
median                     101
Q3                         102
95-th percentile           102
Maximum                    106
Range                      8
Interquartile range (IQR)  3

Descriptive statistics

Standard deviation               1.742161858
Coefficient of variation (CV)    0.01738664409
Kurtosis                         -0.7562887146
Mean                             100.2011572
Median Absolute Deviation (MAD)  1
Skewness                         0.2385298297
Sum                              813934
Variance                         3.035127939
Monotonicity                     Not monotonic
Histogram with fixed size bins (bins=9)
Value  Count  Frequency (%)
102    2388   29.4%
99     1968   24.2%
98     1871   23.0%
101    1632   20.1%
103    72     0.9%
100    48     0.6%
104    48     0.6%
105    48     0.6%
106    48     0.6%

Value  Count  Frequency (%)
98     1871   23.0%
99     1968   24.2%
100    48     0.6%
101    1632   20.1%
102    2388   29.4%
103    72     0.9%
104    48     0.6%
105    48     0.6%
106    48     0.6%

Value  Count  Frequency (%)
106    48     0.6%
105    48     0.6%
104    48     0.6%
103    72     0.9%
102    2388   29.4%
101    1632   20.1%
100    48     0.6%
99     1968   24.2%
98     1871   23.0%

107
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct      6
Distinct (%)  0.1%
Missing       0
Missing (%)   0.0%
Infinite      0
Infinite (%)  0.0%
Mean          109.6881694
Minimum       107
Maximum       112
Zeros         0
Zeros (%)     0.0%
Negative      0
Negative (%)  0.0%
Memory size   63.6 KiB

Quantile statistics

Minimum                    107
5-th percentile            107
Q1                         109.5
median                     110
Q3                         111
95-th percentile           111
Maximum                    112
Range                      5
Interquartile range (IQR)  1.5

Descriptive statistics

Standard deviation               1.380963064
Coefficient of variation (CV)    0.01258989982
Kurtosis                         -0.1799471394
Mean                             109.6881694
Median Absolute Deviation (MAD)  1
Skewness                         -0.8456876591
Sum                              890997
Variance                         1.907058985
Monotonicity                     Not monotonic
Histogram with fixed size bins (bins=6)
Value  Count  Frequency (%)
110    4040   49.7%
111    1712   21.1%
107    1247   15.4%
108    400    4.9%
109    384    4.7%
112    340    4.2%

Value  Count  Frequency (%)
107    1247   15.4%
108    400    4.9%
109    384    4.7%
110    4040   49.7%
111    1712   21.1%
112    340    4.2%

Value  Count  Frequency (%)
112    340    4.2%
111    1712   21.1%
110    4040   49.7%
109    384    4.7%
108    400    4.9%
107    1247   15.4%

113
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct      7
Distinct (%)  0.1%
Missing       0
Missing (%)   0.0%
Infinite      0
Infinite (%)  0.0%
Mean          115.7950265
Minimum       113
Maximum       119
Zeros         0
Zeros (%)     0.0%
Negative      0
Negative (%)  0.0%
Memory size   63.6 KiB

Quantile statistics

Minimum                    113
5-th percentile            114
Q1                         114
median                     116
Q3                         117
95-th percentile           119
Maximum                    119
Range                      6
Interquartile range (IQR)  3

Descriptive statistics

Standard deviation               1.617350789
Coefficient of variation (CV)    0.01396735972
Kurtosis                         -0.4881195139
Mean                             115.7950265
Median Absolute Deviation (MAD)  1
Skewness                         0.3170984273
Sum                              940603
Variance                         2.615823574
Monotonicity                     Not monotonic
Histogram with fixed size bins (bins=7)
Value  Count  Frequency (%)
116    3148   38.8%
114    2148   26.4%
117    1144   14.1%
119    832    10.2%
113    367    4.5%
115    292    3.6%
118    192    2.4%

Value  Count  Frequency (%)
113    367    4.5%
114    2148   26.4%
115    292    3.6%
116    3148   38.8%
117    1144   14.1%
118    192    2.4%
119    832    10.2%

Value  Count  Frequency (%)
119    832    10.2%
118    192    2.4%
117    1144   14.1%
116    3148   38.8%
115    292    3.6%
114    2148   26.4%
113    367    4.5%

Interactions

[Interaction plots: 81 Matplotlib figures, not reproduced in this text version.]

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
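
The calculation described above can be sketched in a few lines of plain Python. This is an illustrative sketch rather than the report's implementation; the function name `pearson_r` and the sample data are assumptions made for the example. Population covariance and population standard deviations are used together, so the divisor cancels.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    std_x = sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    std_y = sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov / (std_x * std_y)

# Hypothetical data, chosen only to illustrate the calculation.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
r = pearson_r(x, y)

# Shifting and rescaling y (with a positive factor) should leave r unchanged.
r_rescaled = pearson_r(x, [3 * v + 5 for v in y])
print(r, r_rescaled)
```

Comparing `r` and `r_rescaled` demonstrates the invariance under changes in location and scale mentioned in the text.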

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
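
As a rough illustration of the description above, ρ can be obtained by ranking both variables and applying the Pearson formula to the ranks. The helper names `average_ranks`, `pearson_r` and `spearman_rho` are hypothetical, and giving tied values the mean of their positions is an assumption about tie handling, not something the report states.

```python
from math import sqrt

def average_ranks(values):
    """1-based ranks; tied values share the mean of the positions they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson_r(xs, ys):
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    std_x = sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    std_y = sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov / (std_x * std_y)

def spearman_rho(xs, ys):
    """Pearson correlation of the rank variables."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

# Hypothetical monotonic but nonlinear relationship.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
print(spearman_rho(x, y), pearson_r(x, y))
```

For a strictly increasing but nonlinear relationship such as this one, ρ reaches 1 while Pearson's r stays below 1.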

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
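
The pair-counting definition above corresponds to the variant usually called τ-a. A minimal sketch follows, with the function name `kendall_tau_a` chosen for illustration; statistical libraries often default to tie-adjusted variants, so results may differ from this sketch when the data contain ties.

```python
from itertools import combinations

def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total number of pairs, as described above."""
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1      # both variables move in the same direction
        elif sign < 0:
            discordant += 1      # the variables move in opposite directions
        # sign == 0 means a tie in x or y: counted in the total only
    n = len(xs)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs

# Hypothetical data: two concordant pairs and one discordant pair.
tau = kendall_tau_a([1, 2, 3], [1, 3, 2])
print(tau)
```

The quadratic loop over all pairs keeps the sketch close to the definition; it is not meant to be efficient for thousands of rows.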

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. There is extensive documentation available here.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. We use a bias-corrected measure that has been proposed by Bergsma in 2013 that can be found here.
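
A sketch of a bias-corrected Cramér's V, following the correction commonly attributed to Bergsma (2013), is given below. It is an assumption that the report uses exactly this form; the function name `cramers_v_corrected` and the example data are hypothetical.

```python
from collections import Counter
from math import sqrt

def cramers_v_corrected(xs, ys):
    """Bias-corrected Cramér's V for two nominal variables of equal length."""
    n = len(xs)
    joint = Counter(zip(xs, ys))          # observed cell counts
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    r, k = len(row_totals), len(col_totals)

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for a, row_count in row_totals.items():
        for b, col_count in col_totals.items():
            expected = row_count * col_count / n
            observed = joint.get((a, b), 0)
            chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    # Bias correction: shrink phi^2 and the effective table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(r_corr - 1, k_corr - 1)
    if denom <= 0:
        return float("nan")               # undefined, e.g. for a constant column
    return sqrt(phi2_corr / denom)

# Hypothetical data: one perfectly associated pair, one independent pair.
perfect = cramers_v_corrected(["a", "b"] * 50, ["x", "y"] * 50)
independent = cramers_v_corrected(["a", "a", "b", "b"] * 25, ["x", "y", "x", "y"] * 25)
print(perfect, independent)
```

A constant column, such as variable 85 in this dataset, has only one category, so the denominator is not positive and the sketch returns NaN rather than a number.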

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

      1  3   9  13  23  25  34  36  38  40  52  54  59  63  67  76  85  86  90  93   98  107  113
0     2  3   9  14  23  26  34  36  39  40  52  55  59  63  67  76  85  86  90  93   99  108  114
1     2  4   9  15  23  27  34  36  39  41  52  55  59  63  67  76  85  86  90  93   99  108  115
2     1  3  10  15  23  25  34  36  38  41  52  54  59  63  67  76  85  86  90  93   98  107  113
3     2  3   9  16  24  28  34  37  39  40  53  54  59  63  67  76  85  86  90  94   99  109  114
4     2  3  10  14  23  26  34  36  39  41  52  55  59  63  67  76  85  86  90  93   98  108  114
5     2  4   9  15  23  26  34  36  39  42  52  55  59  63  67  76  85  86  90  93   98  108  115
6     2  4  10  15  23  27  34  36  39  41  52  55  59  63  67  76  85  86  90  93   99  107  115
7     1  3  10  15  23  25  34  36  38  43  52  54  59  63  67  76  85  86  90  93   98  110  114
8     2  4   9  14  23  26  34  36  39  42  52  55  59  63  67  76  85  86  90  93   98  107  115
9     2  3  10  14  23  27  34  36  39  42  52  55  59  63  67  76  85  86  90  93   99  108  114

Last rows

      1  3   9  13  23  25  34  36  38  40  52  54  59  63  67  76  85  86  90  93   98  107  113
8113  1  6  10  21  24  33  35  36  39  50  52  55  61  65  74  84  85  86  92  97  102  112  116
8114  2  3   9  13  24  28  35  36  39  50  52  58  59  63  73  83  85  88  90  93  104  110  119
8115  1  7  10  13  24  32  34  36  38  48  53  58  59  66  69  76  85  86  90  94  102  110  119
8116  1  7   9  17  24  31  34  36  38  48  53  58  61  63  69  76  85  86  90  94  102  110  116
8117  1  7  10  13  24  29  34  36  38  48  53  58  61  63  69  76  85  86  90  94  102  110  116
8118  2  7   9  13  24  28  35  36  39  50  52  58  59  63  73  83  85  88  90  93  106  112  119
8119  2  3   9  13  24  28  35  36  39  50  52  58  59  63  73  83  85  87  90  93  106  110  119
8120  2  6   9  13  24  28  35  36  39  41  52  58  59  63  73  83  85  88  90  93  106  112  119
8121  1  7  10  13  24  31  34  36  38  48  53  58  59  66  67  76  85  86  90  94  102  110  119
8122  2  3   9  13  24  28  35  36  39  50  52  58  59  63  73  83  85  88  90  93  104  112  119